
We claim: 

1. A method of performing a task on a computer system having a cache, 
wherein the task comprises a plurality of sub-tasks, the method comprising: 

(a) receiving a plurality of requests for the task; 

(b) for each of the requests, storing in a work packet data for performing a
sub-task of the task; 

(c) storing each work packet in a holding area; 

(d) executing the sub-task on each of the work packets in the holding area while 
refraining from executing other sub-tasks of the task, thereby maintaining locality of 
instructions and data in the cache; and 

(e) repeating steps (b), (c) and (d) for each sub-task of the task until all of the
sub-tasks of the task are completed for each request. 

2. The method of claim 1, wherein each sub-task has a type of work 
packet defined for it, wherein the work packet defined for the sub-task includes data 
and functions for performing the sub-task. 

3. The method of claim 1, wherein the holding area is a queue. 

4. The method of claim 1, wherein the holding area is a stack. 



5. The method of claim 1, wherein at least one of the executed work 
packets is a parent work packet, the method further comprising: creating a child work 
packet for the parent work packet; and performing a sub-task of the plurality of
sub-tasks on the child work packet. 

6. A computer-readable medium having stored thereon
computer-executable instructions for performing the method of claim 1. 

7. The method of claim 1, wherein step (b) further comprises storing in 
the work packet a pointer to the data for performing the sub-task. 

8. The method of claim 5, wherein the sub-task performed on the child 
work packet is different from the sub-task performed on the parent work packet. 

9. The method of claim 5, wherein the sub-task performed on the child 
work packet is the same as the sub-task performed on the parent work packet. 

10. The method of claim 5, wherein the sub-task being performed on the 
parent work packet is halted until the sub-task being performed on the child work 
packet is completed. 

11. The method of claim 5, wherein the sub-task being performed on the 
parent work packet is halted until a predefined event occurs. 




12. A method of performing a task on a computer system having a cache, 
wherein the task comprises a plurality of sub-tasks, the method comprising: 

creating an instance of a stage for each sub-task, wherein the stage has an 
associated holding area; placing one or more work packets in the holding area, each 
work packet corresponding to an iteration of the sub-task required for the task, each 
work packet containing data for performing the sub-task; and, performing the sub-task 
on each work packet in the holding area so that the sub-task is repeatedly performed, 
thereby maintaining data locality in the cache for the sub-task. 



13. The method of claim 12, wherein the stage permits an instance of itself 
to be created on only a single processor in the computer system at a time. 



14. The method of claim 12, wherein the stage includes a local data area, 
and wherein the stage regulates which processor is permitted to create an instance of 
it based on the part of the local data area that is required to be accessed. 



15. The method of claim 12, wherein, for at least one of the stage instances 
created, there is at least one work packet that is a parent work packet, the method 
further comprising: creating a child work packet for the parent work packet; sending 
the child work packet to another stage instance; and, at the other stage instance, 
performing a sub-task of the plurality of sub-tasks on the child work packet. 




16. The method of claim 12, wherein the holding area includes a stack and 
a queue, and the method further comprises: for each work packet received by at least 
one stage instance, putting the work packet in the stack if the work packet originated 
from the processor on which the instance of the stage is created, and putting the work 
packet in the queue if the work packet originated from another processor. 

17. The method of claim 12, wherein each work packet contains at least 
one pointer to data for performing the sub-task. 

18. A computer-readable medium having stored thereon
computer-executable instructions for performing the method of claim 12. 

19. A system for executing a procedure on a computer, wherein the 
procedure is divided into a plurality of sub-tasks, the system comprising: a
computer-readable medium having stored thereon a plurality of work packets, each work
packet including data usable to perform an iteration of a sub-task of the plurality; a 
computer-readable medium having stored thereon a plurality of stages, there being at 
least one stage for each sub-task, each stage comprising a holding area for holding a 
batch of the plurality of work packets; and a processor for identifying a stage of the 
plurality of stages and performing an iteration of the stage's sub-task on each of the 
batch of work packets, thereby maintaining the locality of data in a cache of the 
processor. 




20. The system of claim 19, wherein the processor is one of a plurality of 
processors and wherein at least one stage of the plurality is instantiated on at least two 
of the plurality of processors. 

21. The system of claim 19, wherein the holding area of the identified stage 
includes a queue and a stack, wherein the queue is for holding work packets that 
originated from processors other than the one on which an instance of the identified 
stage is created, wherein the stack is for holding work packets that originated from the 
processor on which the instance of the stage is created. 

22. The system of claim 19, wherein the identified stage has a local data 
area that is divided into sections, and wherein the processor determines whether to 
perform the sub-task of the stage based on the section of the local data area to which 
each work packet of the batch will require access. 

23. The system of claim 19, wherein each work packet contains 
instructions necessary to perform the sub-task of the stage at which it is held. 

24. The system of claim 19, wherein each work packet contains at least one 
pointer to instructions necessary to perform the sub-task of the stage at which it is 
held. 

25. The system of claim 19, wherein each stage contains instructions 
necessary to perform its sub-task. 

26. The system of claim 19, wherein the processor is one of a plurality of 
processors, wherein the holding area of each stage is one of a plurality of holding 
areas for the stage, and wherein each of the plurality of holding areas of a stage is 
associated with one of the plurality of processors. 



27. The system of claim 19, wherein the holding area is a queue. 

28. The system of claim 19, wherein the holding area is a priority queue. 



29. The system of claim 19, wherein the holding area is a stack. 
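
For illustration only, the batching recited in claims 1, 12 and 16 might be sketched as follows. This is a minimal single-threaded sketch, not the claimed implementation: the names `Stage`, `put`, `run_batch` and `run_requests`, the three example sub-tasks (parse, square, emit), and the use of integer processor identifiers are all assumptions introduced here. Each stage holds work packets in a stack (packets from the same processor) and a queue (packets from other processors), and drains its whole holding area before the next stage runs.

```python
from collections import deque


class Stage:
    """One stage per sub-task, with a stack-plus-queue holding area (hypothetical)."""

    def __init__(self, name, sub_task, cpu=0):
        self.name = name
        self.sub_task = sub_task   # callable applied to each work packet
        self.cpu = cpu             # assumed id of the processor hosting this instance
        self.stack = []            # packets that originated on this processor
        self.queue = deque()       # packets that originated on another processor

    def put(self, packet, origin_cpu=0):
        # Same-processor packets go on the stack, others on the queue.
        if origin_cpu == self.cpu:
            self.stack.append(packet)
        else:
            self.queue.append(packet)

    def pending(self):
        return bool(self.stack or self.queue)

    def run_batch(self):
        # Perform only this stage's sub-task until the holding area is empty.
        done = 0
        while self.pending():
            packet = self.stack.pop() if self.stack else self.queue.popleft()
            self.sub_task(packet)
            done += 1
        return done


def run_requests(raw_requests):
    trace, results = [], []

    def do_parse(packet):
        trace.append("parse")
        packet["value"] = int(packet["raw"])
        square.put(packet)         # forward to the next stage's holding area

    def do_square(packet):
        trace.append("square")
        packet["value"] **= 2
        emit.put(packet)

    def do_emit(packet):
        trace.append("emit")
        results.append(packet["value"])

    parse = Stage("parse", do_parse)
    square = Stage("square", do_square)
    emit = Stage("emit", do_emit)

    # One work packet per request, all placed in the first stage's holding area.
    for raw in raw_requests:
        parse.put({"raw": raw})

    stages = [parse, square, emit]
    while any(stage.pending() for stage in stages):
        for stage in stages:
            stage.run_batch()
    return results, trace
```

Calling `run_requests(["1", "2", "3"])` in this sketch records every "parse" entry in the trace before any "square" entry, and every "square" entry before any "emit" entry, which is the grouping by sub-task that the claims associate with cache locality. The order in which results are produced depends on the stack and queue discipline and is not significant here.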